Developing Decision Trees for Handling Uncertain Data

نویسندگان

  • P. Satya Prakash
  • P. Jhansi Lakshmi
  • B. L Krishna
چکیده

Classification is one of the most efficient and widely used data mining technique. In classification, Decision trees can handle high dimensional data, and their representation is intuitive and generally easy to assimilate by humans. Decision trees handle the data whose values are certain. We extend such classifiers i.e, decision trees to handle uncertain information. Value uncertainty arises in many applications during the data collection process. Example sources of uncertainty include data staleness, and multiple repeated measurements. With uncertainty, the value of a data item is often represented not by one single value, but by multiple values forming a probability distribution (pdf’s). Rather than abstracting uncertain data by statistical derivatives (such as mean and median), we extend classical decision tree building algorithms to handle data tuples with uncertain values. Extensive experiments have been conducted that show that the resulting classifiers are more accurate than those using value averages.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining

Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...

متن کامل

Decision Trees for handling Uncertain Data to identify bank Frauds

Classification is a classical problem in machine learning and data mining. In traditional decision tree classification, a feature of a tuple is either categorical or numerical. The decision tree algorithms are used for classify the certain and numerical data for many applications. In existing system they implement the extended the model of decision tree classification to accommodate data tuple ...

متن کامل

Classification of Categorical Uncertain Data Using Decision Tree

Certain data is a data whose values are known precisely whereas uncertain data means whose value are not known precisely. But data is always uncertain in real life applications. In data uncertainty attribute value is represented by a set of values. There are two types of attributes in data sets namely, numerical and categorical attributes. Data uncertainty can arise in both numerical and catego...

متن کامل

Handling uncertain labels in multiclass problems using belief decision trees

This paper investigates the induction of decision trees based on the theory of belief functions. This framework allows to handle training examples whose labeling is uncertain or imprecise. A former proposal to build decision trees for twoclass problems is extended to multiple classes. The method consists in combining trees obtained from various two-class coarsenings of the initial frame.

متن کامل

Interpretable Predictive Models for Knowledge Discovery from Home-Care Electronic Health Records

The purpose of this methodological study was to compare methods of developing predictive rules that are parsimonious and clinically interpretable from electronic health record (EHR) home visit data, contrasting logistic regression with three data mining classification models. We address three problems commonly encountered in EHRs: the value of including clinically important variables with littl...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012